ABSTRACT

The two methods in present use for the analysis of
minimum values, namely a graphical method and a method of
moments, are outlined and a brief discussion of each is
given. In addition, a method using order statistics,
devised for maximum values, is adapted to be used for
minimum values in the special case where the lower limit of
the observed droughts is assumed to be zero.

For the general case, where the lower limit is
assumed to be a positive number, a method which combines
the methods of moments and order statistics is proposed.
Using this method, approximate confidence bands are
obtained for the predicted droughts.

CHAPTER I
INTRODUCTION

1. History of Statistical Theory of Extreme Values

The history of the theory of extreme values is not out
of the ordinary. Different authors, using different methods,
independently made the same discoveries at about the same time.
It was a case of a common need and a similar basic knowledge
leading to the same results. Contributions were made by
scientists of Russian, German, French, English and American
origin.


The first work in extreme values was done in
astronomy. Astronomers had to decide what to do with an
outlying observation that differed greatly from the rest.
Another field, gunnery, seems to be directly connected
with the theory of extreme values, but it has made little
or no contribution.

In 1922, L. von Bortkiewicz published a fundamental
paper (11) on the distribution of the range and on the
mean range in samples from the normal distribution as a
function of the sample size. Possibly his greatest
contribution was his statement that the largest normal values
are new variates having distributions of their own. This,
in fact, was the first clear statement of the problem and also
led to a new line of attack.

In 1923, R. von Mises (14) took the first step
toward a knowledge of the asymptotic distribution for normal
observations, by introducing the fundamental notion of
"expected largest value" (to be defined later in this chapter)
which turns out to be a parameter of the asymptotic
distribution.

Also  in  1923,  E.  L.  Dodd  (15)  studied  the  extreme 
values  for  distributions  other  than  the  normal  and  was  the 
first  to  calculate  the  median  of  the  extreme  values.  He 
gave  formulae  for  Galton’s  distribution  and  the  Charlier 
series,  as  well  as  a  generalization  of  the  normal. 

The next contribution was "Tippett's Tables",
which give the numerical values of the probabilities for the
extreme values from a normal distribution for different
sample sizes up to one thousand, and the mean range for
samples from a normal distribution of sizes from two to
one thousand. These were due to L. H. C. Tippett in
1925 (16).

M. Frechet, in 1927 (17), was the first to introduce
the concept of a class of initial distributions and also was
the first to obtain an asymptotic distribution of the extreme
values.

However, Fisher and Tippett published, in 1928 (8),
the paper that is now referred to in all works on extreme
values. They obtained Frechet's asymptotic distribution and
constructed two other asymptotic distributions.

It  should  be  stated  here  that  the  first  researches 
pertaining  to  the  theory  of  extreme  values  started  with  the 
normal  distribution.  This  actually  hampered  progress  due  to 
the  fact  that  none  of  the  fundamental  properties  of  extremes 
are  related  in  a  simple  way  to  the  normal  distribution. 

2.  Aim 

The  aim  of  a  statistical  theory  of  extreme  values  is 
to  explain  the  observed  largest  or  smallest  values  arising 
in samples of a given size n, valid for a given period of
time, or length, area, or volume, and to predict extreme
values that may be expected to occur within a given sample size,
time, area, etc.

Naturally,  this  prediction  does  not  state  that  a 
definite  value  will  occur  at  a  particular  time,  but  rather, 
it  is  the  value  that  is  most  likely  to  occur  within  a 
certain  interval  of  time,  and  gives  limits  within  which  the 
value  may  be  expected  to  lie  with  a  certain  probability. 

There are three essential conditions that should be
fulfilled in applying statistical methods to analyse extreme
value data.

(1) The variables to be considered are statistical
variables.

(2) The initial distribution from which the samples
are drawn, and its parameters, must remain constant from one
sample to the next.

(3)  The  observed  values  should  be  extreme  values  of 
samples  of  independent  data. 

Gumbel (reference (1)) points out that the third
condition is not too critical for the following reasons:

(a)  Since  the  actual  samples  used  in  practical 
applications  are  usually  quite  large,  it  is  possible  to 
delete  a  large  number  of  the  observations  which  may  be 
considered  to  be  dependent,  thus  leaving  a  sample  which  is 
still  of  sufficient  size  and  that  would  now  contain 
independent  data. 

For example, in dealing with droughts, which are
defined as the minimum of the 365 daily discharges in a
year, it would be possible to obtain 100 or 200 observations
that would be independent.

(b)  The  second  reason  is  that,  as  in  so  many  other 
situations  where  the  underlying  causes  can  only  be 
imperfectly  known  or  assumed,  the  analysis  of  data  does  not 
wait  upon  the  development  of  the  most  elaborate  theory 
possible,  but  proceeds  upon  the  theory  built  up  from  simple 
assumptions. Very often the only procedures available are
those based on independence, and hence if the samples are large,
it is often considered safe to proceed as though the data were
actually independent.

3. Exact and Asymptotic Distributions for Smallest Values
(a) Exact Distributions

Let F(x) be the probability that a value of the
variate X is less than or equal to x, that is, $P(X \le x)$,
and let $f(x) = F'(x)$ be the density of probability at x.
This f(x) will be referred to as the "initial distribution".
Then 


P(X  >  x)  =  1  -  F(x)  . 


The probability that n independent observations on X are
all greater than x is

$[1 - F(x)]^{n}$

which also gives the probability that the smallest among the
n independent observations is greater than x.

Therefore the probability that the smallest among n
independent observations on X is less than or equal to x is

(1.1)  $\Phi_n(x) = 1 - [1 - F(x)]^{n}$

and its derivative

(1.2)  $\varphi_n(x) = \Phi_n'(x) = n\,[1 - F(x)]^{n-1} f(x)$

is the distribution of the smallest among n independent
observations. Equation (1.2) forms the basis for the
whole exact theory of smallest values.

1 Since the main interest here is the analysis of drought data,
only smallest values will be considered.
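Equations (1.1) and (1.2) can be checked numerically. The sketch below is only an illustration: it assumes a unit exponential initial distribution, chosen because the smallest of n such values has the simple closed form $1 - e^{-nx}$, and the function names are hypothetical rather than taken from the text.

```python
import math

def smallest_cdf(F, n, x):
    """Equation (1.1): probability that the smallest of n values is <= x."""
    return 1.0 - (1.0 - F(x)) ** n

def smallest_pdf(F, f, n, x):
    """Equation (1.2): density of the smallest of n values at x."""
    return n * (1.0 - F(x)) ** (n - 1) * f(x)

# Assumed initial distribution (illustrative only): unit exponential.
def F_exp(x):
    return 1.0 - math.exp(-x)

def f_exp(x):
    return math.exp(-x)

n, x = 5, 0.2
cdf_value = smallest_cdf(F_exp, n, x)
pdf_value = smallest_pdf(F_exp, f_exp, n, x)
# For the unit exponential the smallest of n values has CDF 1 - exp(-n x).
closed_form = 1.0 - math.exp(-n * x)
print(cdf_value, closed_form, pdf_value)
```

Any other initial distribution can be passed in the same way, provided F and f are a matching distribution function and density.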

(b) Asymptotic Distributions for Smallest Values

Obviously equations (1.1) and (1.2) depend on
knowledge of the initial distribution f(x), which is usually
not known. In order to deal with smallest values, their
asymptotic distributions were obtained.

An important step in the development of the
asymptotic distributions was made by R. von Mises (13),
who introduced the following distinction:

"A  continuous  variate  may  be  either  limited  or 
unlimited  in  the  direction  of  interest.  If  it  is  unlimited 
the  moments  may  or  may  not  exist.  Thus  there  are  three 
categories. 

First those distributions which are unlimited and
where all moments exist. Second, unlimited distributions
where only a finite number of moments exist, and third,
limited distributions."

These three categories give rise to three
different types of initial distributions from which extreme
values may be taken:

Type I: If the probability function F(x) converges with
increasing x toward unity at least as quickly as an
exponential function, then F(x) is said to be of the
exponential type. An exact definition of this type is
obtained from R. von Mises' method for developing the
asymptotic distribution for this type. He derived this
asymptotic distribution under the condition that

(1.3)  $\lim_{x \to \infty} \dfrac{d}{dx}\left[\dfrac{1 - F(x)}{f(x)}\right] = 0$

All initial distributions possessing this
property are said to be of the exponential type.

The prototype is the exponential distribution
itself, while other distributions of this type are the
normal and the chi-square distributions.

Type II: A distribution belongs to this type if the
following property is satisfied:

(1.4)  $\lim_{x \to \infty} x^{k}\,[1 - F(x)] = A; \quad A > 0; \; k > 0$

where A is a constant, and the distribution function F(x)
possesses no moments of order greater than k. The prototype
here is the Cauchy distribution, and consequently it
is called the Cauchy type.

Type III: If the variable x of the distribution function
F(x) is limited in the direction of interest, then the
function F(x) is said to belong to the third type.

The asymptotic distributions for these three
types were found by R. A. Fisher and L. H. C. Tippett (8)
in 1928. The results of this paper are given here and will
be used in the main body of this thesis.

(a) For the exponential type, the asymptotic distribution
of the smallest value turns out to be:

(1.5)  $\Phi(x) = 1 - \exp\left(-e^{-y}\right)$

where $-y = a(x - u)$, and y
is known as the reduced variate. $\Phi(x)$ is the probability
that a drought will be more severe (that is, numerically
smaller) than x. The parameter u is the mode of the
distribution and $\frac{1}{a}$ is a scale parameter which is $\frac{\sqrt{6}}{\pi}$
times the standard deviation of the distribution.

(b) The smallest values from a distribution of the Cauchy
type (Type II) have the following asymptotic distribution:

(1.6)  $\Pi(x) = 1 - \exp\left[-\left(\dfrac{u}{x}\right)^{k}\right]; \quad u < 0; \; k > 0; \; x < 0$

where the initial distribution possesses no moments of order
greater than k.

(c) The third type has the variate X limited by some lower
limit and leads to the following asymptotic distribution:

(1.7)  $P(x) = 1 - \exp\left[-\left(\dfrac{x - \varepsilon}{u - \varepsilon}\right)^{k}\right]$

where ε is the lower limit; k is ... the order of
the lowest derivative of the probability function that does
not vanish at x = ε.

P(x) gives the probability that an X value will
be less than or equal to x.

5.  Return  Period 

A concept commonly used in the treatment of
smallest values is that of "return period".

If $F(x) = P(X \le x)$, then its reciprocal

(1.8)  $T(x) = \dfrac{1}{F(x)}$

is known as the return period of x. This gives the
average number of observations necessary to obtain one value
less than or equal to x, if the observations are made at
constant intervals of time.

CHAPTER II

2.1 Introduction

The theory of extreme values was treated by E. J. Gumbel
in a series of lectures published by the United States Bureau
of Standards in February, 1954. This publication deals mainly
... loads in aeronautics, etc., but states that the same method
can be used for analysing smallest values.

Later, in May of the same year, The American Society of
... directly with droughts, under different basic assumptions.

In this chapter, a brief outline of these two methods will be
given, with examples of each.

2.2 Gumbel's First Method

Gumbel's first approach to the problem of analysing
smallest values assumes that the initial distribution is
unlimited to the left, and that the asymptotic distribution
of the exponential type (see equation (1.5)) given by:

(2.1)  $F(x) = 1 - \exp\left(-e^{-y}\right)$

where $-y = a(x - u)$, is assumed to apply. F(x) is the
probability that a value of the variate X will be less than
or equal to x. Speaking in terms of droughts, F(x) gives
the probability that a future drought will be more severe
(that is, numerically smaller) than x. u and a are the
parameters discussed in Chapter I, which must be estimated.

For this distribution the return period, defined by
(1.8), is given by

(2.2)  $T(x) = \dfrac{1}{F(x)} = \dfrac{1}{1 - \exp\left(-e^{-y}\right)}$

This gives the average number of observations necessary in
order to obtain a drought as small as or smaller than x, if
the observations are made at constant intervals of time.
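The return period of a given drought follows directly from the distribution function. The following is a minimal sketch, assuming the relation $-y = a(x - u)$ of the exponential type; the parameter values are hypothetical and serve only to show that smaller droughts have longer return periods.

```python
import math

def type1_smallest_cdf(x, u, a):
    """F(x) = 1 - exp(-exp(-y)) with -y = a (x - u)."""
    minus_y = a * (x - u)
    return 1.0 - math.exp(-math.exp(minus_y))

def return_period(x, u, a):
    """T(x) = 1 / F(x): average number of observations per drought <= x."""
    return 1.0 / type1_smallest_cdf(x, u, a)

# Hypothetical parameter values, for illustration only.
u, a = 500.0, 0.005

for x in (500.0, 300.0, 100.0):
    print(x, round(type1_smallest_cdf(x, u, a), 4), round(return_period(x, u, a), 2))
```

At x = u the probability is $1 - e^{-1}$ whatever the value of a, so the return period of the mode is always about 1.58 observations.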

The method is essentially a graphical one which uses
probability paper (first proposed by Powell (9)) especially
designed for the treatment of extreme values. A discussion
of the construction and use of probability paper in general
is given in Appendix A.

In order to use this special graph paper, the
observations are first ordered in decreasing magnitude and
then placed on the vertical axis of the paper, which is scaled
linearly. The problem then arises as to the frequency at which
the mth value $x_m$ should be plotted. Since the observations
are ordered in decreasing magnitude, $x_m$ should be plotted at

(2.3)  $P(X > x_m) = 1 - F(y_m) = 1 - F(x_m)$

Gumbel suggests that the average proportion of the population
f(x) exceeding $x_m$ should be used. That is, he puts

(2.4)  $1 - F(x_m) = 1 - F(y_m) = \dfrac{m}{N + 1}$

where $\frac{m}{N+1}$ is the expected value of the proportion of
the population f(x) exceeding $x_m$. (This is derived in
Appendix A.)

If the points $\left(\frac{m}{N+1},\, x_m\right)$ are plotted on the
probability paper, they should be scattered about the straight
line

(2.5)  $x = u - \dfrac{y}{a}$
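The plotting positions and the corresponding reduced variates can also be computed instead of being read from a table. The sketch below assumes that the proportion exceeding $x_m$ equals $\exp(-e^{-y_m})$, which follows from the exponential-type distribution with $-y = a(x - u)$; the sample size N = 17 and the function names are illustrative assumptions.

```python
import math

def plotting_positions(n):
    """Return m/(N+1) for m = 1..N, observations ordered from the largest."""
    return [m / (n + 1) for m in range(1, n + 1)]

def reduced_variate(p):
    """Solve exp(-exp(-y)) = p for y, where p is the proportion exceeded."""
    return -math.log(-math.log(p))

N = 17  # sample size assumed for illustration
positions = plotting_positions(N)
reduced = [reduced_variate(p) for p in positions]

for m, (p, y) in enumerate(zip(positions, reduced), start=1):
    print(m, round(p, 4), round(y, 2))
```

Because the largest observation comes first, the reduced variate increases with m, and the smallest droughts sit at the largest values of y.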

Corresponding to each observation $x_m$, there will be
a return period $T(x_m)$ given by (2.2). An axis for these
return periods, scaled accordingly, is included along the
top of the graph paper so that the return period of any sized
drought can be read directly.

The second problem arising from the use of this probability
paper is that of fitting the straight line (2.5) to the
plotted points. Since the relationship between x and y is
linear, the classical method of least squares can be used.
The two regression lines, y on x and x on y, can both
be fitted and each will give estimates of the two parameters
u and $\frac{1}{a}$. Gumbel combines the two estimates of u by
taking their geometric mean. This gives

(2.6)  $u = \bar{x} + \dfrac{\bar{y}_N}{a}$

which is used as the estimate of u. Similarly the geometric
mean of the two estimates of $\frac{1}{a}$ gives

(2.7)  $\dfrac{1}{a} = \dfrac{s_x}{\sigma_N}$

as the estimate to be used for $\frac{1}{a}$, where $s_x$ and $\bar{x}$ are
the standard deviation and mean respectively of the sample,
and $\bar{y}_N$ and $\sigma_N$ ¹ are the "theoretical" mean and standard
deviation of y given by

(2.8)  $\bar{y}_N = \dfrac{1}{N}\sum_{m=1}^{N} y_m$  and  $\sigma_N^{2} = \dfrac{1}{N}\sum_{m=1}^{N}\left(y_m - \bar{y}_N\right)^{2}$

which are dependent on the sample size only, and have been
tabulated in table II of reference (2). Using these estimates
for the parameters the straight line (2.5) can be drawn on
the graph.

1 $\bar{y}_N$ and $\sigma_N$ are neither statistics (since they do not
depend on the observations) nor purely population values
(since they depend on N). Gumbel refers to them as the "expected"
reduced mean and the "expected" reduced standard deviation.



The third problem arising from the use of this special
paper is that of establishing confidence bands for the
theoretical straight line. For this purpose the distribution of
the mth value $x_m$ is used. Under not very restrictive
conditions it can be shown that any mth value in the
neighborhood of the median is asymptotically normally
distributed about a mean given by (2.4) and with standard
deviation

(2.9)  $\sigma(x_m) = \dfrac{\sqrt{F(x_m)\,[1 - F(x_m)]}}{\sqrt{N}\, f(x_m)}$

where $f(x_m)$ stands for the density of probability
at the value $x_m$, and $F(x_m)$ is defined by (2.4). However, since the
approximation of the exact distribution of the mth value
becomes weaker and weaker as the deviation from the median gets
larger, it should be noted here that (2.9) gives valid
estimates for the standard error of $x_m$ only for probabilities

$0.15 < 1 - F(x_m) < 0.85$

To obtain numerical values for $\sigma(x_m)$, the standard
deviation of the reduced variate y (which has density $\varphi(y)$)
is introduced:

(2.10)  $\sqrt{N}\,\sigma(y_m) = \dfrac{\sqrt{F(y_m)\,[1 - F(y_m)]}}{\varphi(y_m)}$

which can be tabulated as a function of y and has no dimension.
(See table 3.4, reference (1)). Having these values for
$\sqrt{N}\,\sigma(y_m)$, $\sigma(x_m)$ can be obtained from:

(2.11)  $\sigma(x_m) = \dfrac{\sqrt{N}\,\sigma(y_m)}{\sqrt{N}\, a}$

To obtain the confidence bands, these values of
$\sigma(x_m)$ are added to and subtracted from the theoretical
values $x_m$, situated on the straight line (2.5). This gives
a probability of 0.6827 that each mth value will be contained
in the interval thus obtained. If a larger probability is
desired, two standard deviations are added to and subtracted
from the theoretical values. This raises the probability to
0.9545.
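The dimensionless factor of (2.10) need not be taken from a table. The sketch below computes it under the assumption that the proportion exceeding $x_m$ is $\exp(-e^{-y})$ with density $e^{-y}\exp(-e^{-y})$ in y; the function names are hypothetical. Under that assumption the factor comes out at about 1.3108 for y = 0.

```python
import math

def reduced_cdf(y):
    """exp(-exp(-y)): the proportion 1 - F expressed in the reduced variate."""
    return math.exp(-math.exp(-y))

def reduced_density(y):
    """Derivative of reduced_cdf with respect to y."""
    return math.exp(-y) * reduced_cdf(y)

def standard_error_factor(y):
    """Dimensionless factor sqrt(N) * sigma(y_m) of equation (2.10)."""
    p = reduced_cdf(y)
    return math.sqrt(p * (1.0 - p)) / reduced_density(y)

def standard_error_x(y, n, scale):
    """sigma(x_m) of equation (2.11); scale is the estimate of 1/a."""
    return standard_error_factor(y) * scale / math.sqrt(n)

for y in (-0.5, 0.0, 1.0, 2.0):
    print(y, round(standard_error_factor(y), 4))
```

Since the product F(1 - F) is symmetric in F and 1 - F, it does not matter which of the two proportions is written first in the numerator.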

However, as stated above, the standard errors given by
(2.11) are valid only in the neighborhood of the median and
hence an extension is needed for the control curves to include
the very smallest values. To do this, Gumbel utilizes the
asymptotic probability distributions of the smallest and second
smallest values.

If the initial distribution is of the exponential type,
it has been shown (reference (18)) that the distribution of the
mth largest observation (from above) converges toward

$\varphi_m(x_m) = a_m \dfrac{m^{m}}{(m-1)!} \exp\left[-m\,y_m - m\,e^{-y_m}\right]$

where $y_m = a_m(x_m - u_m)$ stands for the reduced variate from
the population consisting of mth largest values.

Therefore the asymptotic distribution of the smallest value is

$\varphi_N(x_N) = a_N \dfrac{N^{N}}{(N-1)!} \exp\left[-N\,y_N - N\,e^{-y_N}\right]$

and for the second smallest value is

$\varphi_{N-1}(x_{N-1}) = a_{N-1} \dfrac{(N-1)^{N-1}}{(N-2)!} \exp\left[-(N-1)\,y_{N-1} - (N-1)\,e^{-y_{N-1}}\right]$

In order to extend the control curves Gumbel has shown
(reference (1)) that the interval obtained by adding and
subtracting the value

(2.12)  $\dfrac{1.1407}{a_N}$

to and from the theoretical smallest value $x_N$ situated on
the straight line (2.5) will contain the observed smallest
value with probability equal to 0.6827.

Similarly the interval obtained by adding and subtracting

(2.12a)  $\dfrac{0.7541}{a_{N-1}}$

to and from the theoretical second smallest value $x_{N-1}$ will
contain the second smallest observation with the same probability.

The parameters $\frac{1}{a_N}$ and $\frac{1}{a_{N-1}}$ should be estimated by considering a sample
made up of the smallest (second smallest) values from many
samples of size N. However, in practice this is usually not
available. Gumbel uses the estimate for $\frac{1}{a}$ obtained from
(2.7) as the estimate for both $\frac{1}{a_N}$ and $\frac{1}{a_{N-1}}$.

If the points obtained by utilizing (2.12) and (2.12a)
are joined with the previously obtained bands, smooth curves
result and there is probability equal to 0.6827 that these
curves will contain the plotted points. If a probability equal
to 0.9545 is desired, the values added to and subtracted
from the smallest and second smallest values are

(2.13)  $\dfrac{\ldots}{a}$  and  $\dfrac{1.7820}{a}$

respectively.

For extrapolation purposes, Gumbel has applied the
principle of confidence bands to return periods. He has shown
that, with probability 0.6827, a drought x will occur for
the first time between

(2.14)  0.32 T  and  3.13 T

where T is the return period corresponding to the drought x.
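One way to see that limits of this size are plausible is to treat the waiting time to the first occurrence as approximately exponential with mean T. This is an assumption made here for illustration, not necessarily the derivation used in the reference, and the function names are hypothetical.

```python
import math

def first_occurrence_probability(lower, upper):
    """P(lower*T < waiting time < upper*T) for an exponential waiting time with mean T."""
    return math.exp(-lower) - math.exp(-upper)

def first_occurrence_interval(T, lower=0.32, upper=3.13):
    """Interval (2.14) for a given return period T."""
    return lower * T, upper * T

coverage = first_occurrence_probability(0.32, 3.13)
print(round(coverage, 4))

# Hypothetical return period of 50 observations.
print(first_occurrence_interval(50.0))
```

Under this approximation the coverage of the interval from 0.32 T to 3.13 T comes out close to the stated 0.6827.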
2.3 Example Using Gumbel's First Method

In order to illustrate the method outlined above, the
following example is worked out. The drought values are
observations on a certain river, call it "River R", during a
17 year period. They represent the minimum flow of water past
a particular point on the river, call it "Point P", during each
of the 17 years. The values, in the order in which they were
observed, are given in the second column of table 2.1.

Calculations:

(1) The observations are ordered from above and the 17
plotting positions are obtained by calculating the fractions
$\frac{m}{18}$, where m = 1, 2, ..., 17.

(2) The points $\left(\frac{m}{N+1},\, x_m\right)$ are then plotted on extremal
probability paper with the observed values as ordinates, and the
fractions $\frac{m}{18}$ as abscissae. (See figure 2.1).

Table 2.1: Drought observations at Point P on River R over a
17 year period during the first quarter of each year

Yr.   Observations     Observations      1 - F(y) = m/(N+1)    y (table 1,
      (as observed)    (ordered) x_m                           ref. 3)

 1        367             925                0.0556             -1.69
 2        358             750                0.111              -0.81
 3        252             605                0.1667             -0.58
 4        150             573                0.222              -0.42
 5        605             563                0.278              -0.28
 6        293             430                0.333              -0.14
 7        339             367                0.389               0.08
 8        573             358                0.445               0.22
 9        750             339                0.500               0.39
10        925             293                0.556               0.52
11        563             270                0.612               0.70
12        270             252                0.667               0.90
13        134             187                0.723               1.12
14        430             170                0.777               1.33
15        170             150                0.834               1.70
16        119             134                0.889               2.15
17        187             119                0.944               2.87

(3) In order to fit the theoretical straight line

$x = u - \dfrac{y}{a}$

to the plotted points, the mean $\bar{x}$ and standard deviation $s_x$
must be calculated. Having obtained these, the estimates
for the parameters u and $\frac{1}{a}$ are obtained from

$\dfrac{1}{a} = \dfrac{s_x}{\sigma_N}$  and  $u = \bar{x} + \dfrac{\bar{y}_N}{a}$

where $\sigma_N$ and $\bar{y}_N$ can be obtained from table II of
reference (2).

For this example,

$\sigma_{17} = 1.0474$  and  $\bar{y}_{17} = 0.5172$

$s_x = 224.79$

$\bar{x} = 381.47$

Therefore

$\dfrac{1}{a} = \dfrac{224.79}{1.0474} = 214.62$

$u = 381.47 + (0.5172)(214.62) = 492.47$

and the theoretical straight line becomes

$x = 492.47 - 215\,y$

which is then plotted on the paper.
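The arithmetic of step (3) can be retraced in a few lines. This is a sketch only: the reduced mean 0.5172 and reduced standard deviation 1.0474 are the values quoted in the text for N = 17, the standard deviation is assumed to use the divisor N (which reproduces 224.79), and the function names are hypothetical.

```python
import statistics

# Ordered drought observations for River R (17 years), as used in the example.
droughts = [925, 750, 605, 573, 563, 430, 367, 358, 339,
            293, 270, 252, 187, 170, 150, 134, 119]

# Reduced mean and reduced standard deviation for N = 17, as quoted in the text.
Y_BAR_N = 0.5172
SIGMA_N = 1.0474

def fit_first_method(values, y_bar_n, sigma_n):
    """Return (u, 1/a): 1/a = s_x / sigma_N and u = mean + y_bar_N / a."""
    mean_x = statistics.fmean(values)
    # Population form (divisor N) is an assumption; it matches s_x = 224.79.
    s_x = statistics.pstdev(values)
    scale = s_x / sigma_n
    u = mean_x + y_bar_n * scale
    return u, scale

def theoretical_drought(y, u, scale):
    """Point on the fitted straight line x = u - y/a."""
    return u - y * scale

u_hat, scale_hat = fit_first_method(droughts, Y_BAR_N, SIGMA_N)
print(round(u_hat, 2), round(scale_hat, 2))
print(round(theoretical_drought(2.87, u_hat, scale_hat), 1))
```

The function `theoretical_drought` gives the ordinate of the fitted line at any reduced variate, which is the value about which the confidence bands are later centred.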


(4) The confidence bands are obtained by first finding
$\sigma(x_m)$ for different $y_m$ values from

$\sigma(x_m) = \dfrac{\sqrt{N}\,\sigma(y_m)}{\sqrt{N}\, a}$

where $\frac{1}{\sqrt{N}\,a}$ must be calculated and $\sqrt{N}\,\sigma(y_m)$ is obtained
from table 3.4 of reference (1), which is given here in the
first three columns of table 2.2. These values are added to and
subtracted from the x values situated on the straight line that
correspond to the selected $y_m$ values, for m not greater than 15.

For m = 16 and m = 17, the values

$\dfrac{0.7541}{a}$  and  $\dfrac{1.1407}{a}$

are added to and subtracted from the theoretical x values
corresponding to $y_{16}$ and $y_{17}$. Using the estimate of $\frac{1}{a}$
already obtained, these values are calculated and for this
example are included in table 2.2.

Table 2.2: Standard errors $\sigma(x_m)$ of the mth values $x_m$,
to be used as confidence band half-widths for values of m up
to 15. The values to be used for m = 16 and m = 17 are
included in column 4.


  y       $\sqrt{N}\,\sigma(y_m)$    $\sigma(x_m)$    Confidence band half-widths
                                          for the smallest and second
                                          smallest values

 -0.5        1.2431           59.2
  0.0        1.3108           62.2
  0.5        1.5057           71.5
  1.0        1.8126           86.2
  1.5        2.2408          106.5
  2.0        2.8129          133.7
  2.15                                    0.7541/a = 153.1
  2.87                                    1.1407/a


[Figure 2.1]

2.4 The Second Method Proposed by E. J. Gumbel

This method takes into account the fundamental difference
between floods and droughts. For droughts the lower limit must
be assumed to be either zero or some positive number. In the
previous method drought was treated as being unlimited to the
left, which of course is unrealistic. Since the initial
distribution now under consideration is a limited one in the
direction of interest, the asymptotic distribution of the third
type, given by (1.7) (with k replaced by a)

(2.15)  $P(X \le x) = P(x) = 1 - \exp\left[-\left(\dfrac{x - \varepsilon}{u - \varepsilon}\right)^{a}\right]; \quad a > 0; \; \varepsilon \ge 0; \; x \ge \varepsilon; \; u > \varepsilon$

is used.

Case 1: Lower limit zero

If X represents drought observations and if ε is taken
to be zero, then (1.7), with k replaced by a, becomes:

(2.16)  $P(x) = 1 - \exp\left[-\left(\dfrac{x}{u}\right)^{a}\right]$

which gives the probability that an observed drought will be
less than or equal to a particular x.

The drought x = u is that value that will be exceeded
36.788% of the time and Gumbel suggests that it be used to
characterize a given river. It is therefore called the
"characteristic drought".

If the probability is taken to be $\frac{1}{2}$, the median
drought $\tilde{x}$ can be shown to be given by

(2.17)  $\tilde{x} = u\,(\ln 2)^{1/a}$

The mode $\hat{x}$, obtained after two differentiations of
(2.16), turns out to be:

(2.18)  $\hat{x} = u\left(1 - \dfrac{1}{a}\right)^{1/a}$

which is smaller than the characteristic drought u. Since
$\hat{x}$ must be positive, a mode exists only if $\frac{1}{a} < 1$.

If

$\dfrac{1}{a} = 1 - \ln 2 = 0.30685$,

then the mode $\hat{x}$ equals the median $\tilde{x}$ and the distribution
is nearly symmetrical.

If $\frac{1}{a} > 0.30685$ ($\frac{1}{a} < 0.30685$), the mode will precede (exceed) the median.
These facts determine the general shape of distribution (2.16)
and are illustrated in figures 2.2, 2.3, 2.4.
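The relations between the characteristic drought, the median and the mode can be verified numerically. The sketch below assumes the zero-lower-limit distribution $P(x) = 1 - \exp[-(x/u)^{a}]$; the value of u is hypothetical and the function names are illustrative.

```python
import math

def characteristic_exceedance():
    """P(X > u): exp(-1), whatever the value of a."""
    return math.exp(-1.0)

def median_drought(u, a):
    """Median u (ln 2)^(1/a)."""
    return u * math.log(2.0) ** (1.0 / a)

def mode_drought(u, a):
    """Mode u (1 - 1/a)^(1/a); defined only when 1/a < 1."""
    if a <= 1.0:
        raise ValueError("a mode exists only if 1/a < 1")
    return u * (1.0 - 1.0 / a) ** (1.0 / a)

u = 100.0                                # hypothetical characteristic drought
a_equal = 1.0 / (1.0 - math.log(2.0))    # 1/a = 1 - ln 2 = 0.30685

print(round(characteristic_exceedance(), 5))
print(median_drought(u, a_equal), mode_drought(u, a_equal))
print(mode_drought(u, 2.0) < median_drought(u, 2.0))   # 1/a = 0.5 > 0.30685
print(mode_drought(u, 5.0) > median_drought(u, 5.0))   # 1/a = 0.2 < 0.30685
```

Both the median and the mode stay below u for every admissible a, in line with the statement that the mode is smaller than the characteristic drought.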
In order to analyse drought data if the lower limit is
assumed to be zero, the following transformation is made in (2.16) ¹:

(2.19)  Let $x = e^{z}$ and $u = e^{v}$

(2.16) now becomes:

$P(x) = 1 - \exp\left[-e^{a(z - v)}\right]$

which may be written as

(2.20)  $P(x) = 1 - \exp\left[-e^{-y_1}\right]$

where

(2.20a)  $-y_1 = a(z - v) = a^{*}(\log x - \log u)$

where

(2.21)  $a^{*} = a \log_e 10 = 2.3026\,a$

Since the variable $y_1$ is a linear function of log x
and has the same cumulative distribution function as the y
in (1.5), the graphical method described in the first part of
this chapter may be used here as well. The only difference is
that here the common logarithms of the droughts, instead of
the droughts themselves, are plotted against the $y_1$ values.

1 A discussion of the distribution (2.16) under a similar
transformation to (2.19) is given in Chapter 3.

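The return periods T(x), given by

(2.22)  $T(x) = \dfrac{1}{P(x)} = \dfrac{1}{1 - \exp\left[-e^{-y_1}\right]}$

are also plotted on the extremal probability paper as before.
Instead of estimating the parameters u and $\frac{1}{a}$ as
before, the parameters log u and $\frac{1}{a^{*}}$ are estimated.
However, the estimates are obtained by the same methods and
are found to be

(2.23)  $\log u = \overline{\log x} + \dfrac{\bar{y}_N}{a^{*}}$  and  $\dfrac{1}{a^{*}} = \dfrac{s_{\log x}}{\sigma_N}$

where $\overline{\log x}$ and $s_{\log x}$ are the mean and standard deviation
of the common logarithms of the observations, and $\bar{y}_N$ and
$\sigma_N$ are the reduced mean and reduced standard
deviation obtained from (2.8) (tabulated on page 439-6
of reference 2).

Finally, the theoretical droughts are obtained from the
graph of the straight line

(2.24)  $\log x = \log u - \dfrac{y_1}{a^{*}}$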

Case 2: The lower limit not equal to zero

Consider the general case, that is, where the lower limit
is some positive number $\varepsilon$. Once again the cumulative
distribution function

        $P(x) = 1 - \exp\left[-\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{a}\right]$

given by (1.7) is used, where $\varepsilon$ becomes the third
parameter to be estimated.

For a graphical representation of this case the
transformation

(2.25)  $\log (x - \varepsilon) = \log (u - \varepsilon) - \frac{y}{a^{*}}$

is used. However, the relationship between $y$ and $\log x$
is no longer linear. Letting $\ln x$ represent the natural
logarithm of $x$, we see that
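        $\frac{d^{2}(\ln x)}{dy^{2}} = \frac{d}{dy}\left(\frac{1}{x}\frac{dx}{dy}\right)$.

But from (2.25)

        $x = (u - \varepsilon)\,e^{-y/a} + \varepsilon$

giving

        $\frac{dx}{dy} = -\frac{1}{a}(u - \varepsilon)\,e^{-y/a} = -\frac{x - \varepsilon}{a}$.

Thus,

(2.26)  $\frac{d^{2}(\ln x)}{dy^{2}} = \frac{d}{dy}\left(-\frac{1}{a} + \frac{\varepsilon}{a x}\right) = -\frac{\varepsilon}{a x^{2}}\frac{dx}{dy} = \frac{\varepsilon (x - \varepsilon)}{a^{2} x^{2}} > 0$.

Therefore, if $\log x$ is plotted against $y$ the resulting
curve is bent downward.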

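Since the previous graphical estimate of the parameters
is not possible, the classical method of moments is used.
Differentiating (1.7), the density function $p(x)$ is obtained:

        $p(x) = \frac{a}{u - \varepsilon}\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{a - 1}\exp\left[-\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{a}\right]$.

Therefore, the $k$th moment of $\frac{x - \varepsilon}{u - \varepsilon}$ is given by

(2.28)  $\overline{\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{k}} = \int_{\varepsilon}^{\infty}\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{k} p(x)\,dx = \Gamma\left(1 + \frac{k}{a}\right)$.

Therefore, the first three moments are given by

        $\overline{x - \varepsilon} = (u - \varepsilon)\,\Gamma\left(1 + \frac{1}{a}\right)$,

(2.29)  $\overline{(x - \varepsilon)^{2}} = (u - \varepsilon)^{2}\,\Gamma\left(1 + \frac{2}{a}\right)$, and

        $\overline{(x - \varepsilon)^{3}} = (u - \varepsilon)^{3}\,\Gamma\left(1 + \frac{3}{a}\right)$.

The variance $\sigma^{2}$ of $(x - \varepsilon)$ is

(2.30)  $\sigma^{2} = (u - \varepsilon)^{2}\left[\Gamma\left(1 + \frac{2}{a}\right) - \Gamma^{2}\left(1 + \frac{1}{a}\right)\right]$

and the third central moment $\mu_{3}$ of $(x - \varepsilon)$ is

(2.31)  $\mu_{3} = \overline{(x - \varepsilon)^{3}} - 3\,\overline{(x - \varepsilon)^{2}}\;\overline{(x - \varepsilon)} + 2\left(\overline{x - \varepsilon}\right)^{3}$.

Using (2.30) and (2.31), the skewness $\sqrt{\beta_{1}}$ is obtained,
since

(2.32)  $\sqrt{\beta_{1}} = \frac{\mu_{3}}{\sigma^{3}}$.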
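Therefore,

(2.32a) $\sqrt{\beta_{1}} = \frac{\Gamma(1 + 3/a) - 3\,\Gamma(1 + 2/a)\,\Gamma(1 + 1/a) + 2\,\Gamma^{3}(1 + 1/a)}{\left[\Gamma(1 + 2/a) - \Gamma^{2}(1 + 1/a)\right]^{3/2}}$.

This expression depends only on $1/a$, and hence if $\sqrt{\beta_{1}}$ is
replaced by the sample value $\sqrt{b_{1}}$, an estimate of $1/a$ can
be obtained. (Values of $\sqrt{\beta_{1}}$ are tabulated in reference 2 for
different values of $1/a$.)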

To estimate $u$, (2.30) is used. The relationship

(2.33)  $u = \bar{x} + \sigma A(a)$

where

(2.33a) $A(a) = \frac{1 - \Gamma(1 + 1/a)}{\left[\Gamma(1 + 2/a) - \Gamma^{2}(1 + 1/a)\right]^{1/2}}$

is obtained, and since $1/a$ has already been estimated an
estimate of $u$ can be obtained if

        $s = \sqrt{\overline{x^{2}} - \bar{x}^{2}}$

is used as an estimate of $\sigma$.

To estimate $\varepsilon$, the first of equations (2.29) is written in
the form

        $\varepsilon = \frac{\bar{x} - u\,\Gamma(1 + 1/a)}{1 - \Gamma(1 + 1/a)}$

and $\bar{x}$ is replaced by its value from (2.33), giving

(2.34)  $\varepsilon = u - \sigma B(a)$

where

(2.34a) $B(a) = \frac{1}{\left[\Gamma(1 + 2/a) - \Gamma^{2}(1 + 1/a)\right]^{1/2}}$

and is also tabulated in reference 2. The estimates of $u$
and $1/a$ already obtained are used along with the sample
standard deviation $s$ for $\sigma$.

A criterion as to whether the lower limit should be
taken as zero or not is established from equations (2.33) and
(2.34), since

        $\varepsilon > 0$  if  $\bar{x} + s\left[A(a) - B(a)\right] > 0$.

A more convenient form of this condition is

(2.35)  $\varepsilon > 0$  if  $\frac{s}{\bar{x}} < \frac{\left[\Gamma(1 + 2/a) - \Gamma^{2}(1 + 1/a)\right]^{1/2}}{\Gamma(1 + 1/a)}$.

If the equality is fulfilled, the lower limit is
taken to be zero. If $\varepsilon$ turns out to be negative but small,
it can safely be assumed to be zero.
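The tabulated quantities in (2.32a), (2.33a), (2.34a) and the criterion (2.35) can also be evaluated numerically. The following is a minimal sketch, not part of the original method: the function names are hypothetical, the gamma function is taken from Python's standard library, and the bisection assumes that the skewness increases with $1/a$ over the search interval. The sample skewness 0.7261 is the value from Table 2.5.

```python
import math

def _gammas(inv_a):
    """Return Gamma(1 + k/a) for k = 1, 2, 3, given 1/a."""
    return tuple(math.gamma(1.0 + k * inv_a) for k in (1, 2, 3))

def skewness(inv_a):
    """Theoretical skewness sqrt(beta_1) of (2.32a) as a function of 1/a."""
    g1, g2, g3 = _gammas(inv_a)
    return (g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5

def A(inv_a):
    """Factor A(a) of (2.33a)."""
    g1, g2, _ = _gammas(inv_a)
    return (1.0 - g1) / math.sqrt(g2 - g1 ** 2)

def B(inv_a):
    """Factor B(a) of (2.34a)."""
    g1, g2, _ = _gammas(inv_a)
    return 1.0 / math.sqrt(g2 - g1 ** 2)

def solve_inv_a(sample_skewness, lo=0.05, hi=3.0, tol=1e-10):
    """Bisection for 1/a; assumes skewness() is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if skewness(mid) < sample_skewness:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lower_limit_positive(mean, s, inv_a):
    """Criterion (2.35): the lower limit is positive when s / mean is below the bound."""
    g1, g2, _ = _gammas(inv_a)
    return s / mean < math.sqrt(g2 - g1 ** 2) / g1

inv_a = solve_inv_a(0.7261)   # sample skewness taken from Table 2.5
print(round(inv_a, 4), round(A(inv_a), 4), round(B(inv_a), 4))
print(lower_limit_positive(200.59, 85.95, inv_a))
```

Solving numerically in this way replaces the table look-up in reference 2; the results should agree with the tabulated values to within rounding.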
After the three parameters have been estimated, the
theoretical droughts are obtained from (2.25) and may be
plotted against $y$ on logarithmic extremal probability
paper¹. The expected droughts for the desired return periods
can easily be read off the graph.

¹ This paper differs from ordinary extremal probability paper
in that the axis corresponding to the drought values is scaled
logarithmically.

2.5  Example of Drought Analysis Using the Method of Moments

Case 1:

The lower limit $\varepsilon$ is assumed to be zero. Table 2.3
gives the droughts observed at Point P on River R over a 17
year period, their logarithms and the frequencies at which
they are to be plotted.

If logarithmic probability paper is available the
droughts themselves are plotted against the frequencies,
as in figure 2.5. If ordinary extremal probability paper
is being used, the logarithms of the droughts are plotted, and
the theoretical straight line

        $\log x = \log u - \frac{y}{a^{*}}$

is fitted to the plotted points.

The estimates of $\log u$ and $1/a^{*}$ are obtained as
solutions of (2.23):

        $\frac{1}{a^{*}} = \frac{s(\log x)}{\sigma_N}$ ;  $\log u = \frac{\bar{y}_N}{a^{*}} + \overline{\log x}$

For this example:

        $\overline{\log x} = 1.9667$ ;  $s(\log x) = \sqrt{\overline{(\log x)^{2}} - \left(\overline{\log x}\right)^{2}} = 0.2005$

and $\sigma_N$ and $\bar{y}_N$ are obtained from table II of reference (2) to be

        $\sigma_{17} = 1.0411$ ;  $\bar{y}_{17} = 0.5181$
Table 2.3: Droughts observed at Point P on River R over a
17 year period

 Yr.   Droughts x      Droughts x    log x        m/(N+1)
       (as observed)   (ordered)     (ordered)    m = 1,...,17

  1         76            189         2.2765       0.0556
  2         57            182         2.2601       0.111
  3         51            169         2.2279       0.167
  4         50            142         2.1523       0.222
  5        182            123         2.0899       0.278
  6        189            122         2.0864       0.333
  7        123            115         2.0607       0.389
  8        108            113         2.0531       0.444
  9        142            108         2.0334       0.500
 10        169             80         1.9031       0.556
 11        113             76         1.8808       0.611
 12         68             68         1.8325       0.667
 13        115             57         1.7559       0.722
 14        122             52         1.7160       0.778
 15         52             51         1.7076       0.833
 16         50             50         1.6990       0.889
 17         80             50         1.6990       0.944
Therefore, the required estimates are

        $\frac{1}{a^{*}} = \frac{0.2005}{1.0411} = 0.1926$

and

        $\log u = 1.9667 + (0.5181)(0.1926) = 2.0665$

The straight line then becomes

        $\log x = 2.0665 - 0.1926\,y$

and this line is drawn on the probability paper as in
figure 2.5.

The return period of any drought can be read from
this graph. For example, the return period of the drought
25 cu. ft./sec. is obtained to be approximately 29.5 years.
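The Case 1 calculation can be retraced from the observations in Table 2.3. The sketch below is illustrative only: the variable and function names are hypothetical, the reduced mean and standard deviation are the values quoted above for N = 17, and the standard deviation is taken with divisor N, which appears to be the convention used in the text. Evaluated directly, (0.5181)(0.1926) is about 0.0998, so $\log u$ comes out near 2.0665, and a return period computed from the fitted line need not coincide exactly with a value read off the graph.

```python
import math

droughts = [76, 57, 51, 50, 182, 189, 123, 108, 142,
            169, 113, 68, 115, 122, 52, 50, 80]      # Table 2.3, as observed
SIGMA_N, YBAR_N = 1.0411, 0.5181   # reduced st. dev. and mean quoted for N = 17

logs = [math.log10(x) for x in droughts]
n = len(logs)
mean_log = sum(logs) / n
s_log = math.sqrt(sum(v * v for v in logs) / n - mean_log ** 2)

inv_a_star = s_log / SIGMA_N                # first relation of (2.23)
log_u = mean_log + YBAR_N * inv_a_star      # second relation of (2.23)

def return_period(x):
    """T(x) = 1 / P(x), with P(x) = 1 - exp(-exp(-y)) and y taken from (2.24)."""
    y = (log_u - math.log10(x)) / inv_a_star
    return 1.0 / (1.0 - math.exp(-math.exp(-y)))

print(f"1/a* = {inv_a_star:.4f}, log u = {log_u:.4f}")
print(f"T(25) = {return_period(25):.1f} years")
```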

Case 2: The lower limit $\varepsilon$ not zero.

The droughts measured in River R during the second
quarter of each of 17 years are analysed. Table 2.4 gives
the observed values and some of the preliminary calculations
required, and Table 2.5 gives the remaining calculations
needed to estimate the three parameters $u$, $1/a$ and $\varepsilon$.
Table 2.4: Droughts observed at Point P on River R during
the second quarter of each year over a 17 year period.

 Yr.       x          x^2              x^3

  1       126        15,876        2,000,376
  2       164        26,896        4,410,944
  3       115        13,225        1,520,875
  4       139        19,321        2,685,619
  5       375       140,625       52,734,375
  6       238        56,644       13,481,272
  7       176        30,976        5,451,776
  8       238        56,644       13,481,272
  9       343       117,649       40,353,607
 10       339       114,921       38,958,219
 11       218        47,524       10,360,232
 12       113        12,769        1,442,897
 13       174        30,276        5,268,024
 14       282        79,524       22,425,768
 15       103        10,609        1,092,727
 16       149        22,201        3,307,949
 17       118        13,924        1,643,032

 Total   3410       809,604      220,618,964
Table 2.5: Estimate of the Three Parameters: River R, Point P

( 1.)  Mean drought  $\bar{x} = \frac{3410}{17} = 200.59$
( 2.)  Mean square  $\overline{x^{2}} = \frac{809,604}{17} = 47,623.76$
( 3.)  Variance  $s^{2} = \overline{x^{2}} - \bar{x}^{2} = (47,623.76) - (200.59)^{2} = 7,387.42$
( 4.)  St. Dev.  $s = 85.95$
( 5.)  $s^{3} = 634,948.75$
( 6.)  $\overline{x^{3}} = 220,618,964 \div 17 = 12,977,585$
( 7.)  $m_{3} = \overline{x^{3}} - 3(\overline{x^{2}})(\bar{x}) + 2\bar{x}^{3} = 461,009.8$
( 8.)  Skewness  $\sqrt{b_{1}} = \frac{m_{3}}{s^{3}} = 0.7261$
( 9.)  $1/a$ : from table IV of reference (2) = 0.5357
(10.)  $1/a^{*} = (0.4343)(0.5357) = 0.2327$
(11.)  $A(a)$ : table IV of reference (2) = 0.2269
(12.)  $u = \bar{x} + sA(a) = 200.59 + (85.95)(0.2269) = 220.09$
(13.)  $B(a)$ : table IV of reference (2) = 2.0248
(14.)  $sB(a) = u - \varepsilon = (85.95)(2.0248) = 174.03$
(15.)  $\varepsilon = u - sB(a) = 220.09 - 174.03 = 46.06$
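The steps of Tables 2.4 and 2.5 can be retraced as follows. This is a sketch under stated assumptions: the variable names are hypothetical, and $A(a)$ and $B(a)$ are not recomputed but entered as the values read from the table in steps (11.) and (13.). Small differences in the last digits relative to Table 2.5 are to be expected from rounding in the hand calculation.

```python
import math

x = [126, 164, 115, 139, 375, 238, 176, 238, 343,
     339, 218, 113, 174, 282, 103, 149, 118]       # Table 2.4
A_TAB, B_TAB = 0.2269, 2.0248   # A(a) and B(a) as read in steps (11.) and (13.)

n = len(x)
m1 = sum(x) / n                       # step (1.): mean
m2 = sum(v ** 2 for v in x) / n       # step (2.): mean square
m3_raw = sum(v ** 3 for v in x) / n   # step (6.): mean cube

s = math.sqrt(m2 - m1 ** 2)                        # steps (3.) and (4.)
mu3 = m3_raw - 3.0 * m2 * m1 + 2.0 * m1 ** 3       # step (7.)
sqrt_b1 = mu3 / s ** 3                             # step (8.)

u = m1 + s * A_TAB       # step (12.)
eps = u - s * B_TAB      # step (15.)

print(f"mean = {m1:.2f}, s = {s:.2f}, skewness = {sqrt_b1:.4f}")
print(f"u = {u:.2f}, lower limit = {eps:.2f}")
```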
CHAPTER III

Analysis of Minimum Values Using Order Statistics.

3.1  Introduction

In January 1954, the National Advisory Committee
for Aeronautics in the United States published a paper by
Julius Lieblein (reference (4)) outlining an entirely
different method for analysing extreme value data. However,
the method is given only for maximum values, since the main
concern was gust loads on an airplane in flight. In this
chapter this method, which is one of order statistics, will
be outlined for use in the analysis of minimum values; in
particular, droughts where the lower limit is assumed to be
zero.

3.2  The method of order statistics.

Let $X$ represent drought values. Then the
probability of a drought more severe than $x$ (that is,
numerically smaller than $x$) is given by (1.7) with $\varepsilon$,
the lower limit, assumed to be zero:

(3.1)   $P(X \le x) = 1 - \exp\left[-\left(\frac{x}{u}\right)^{a}\right]$ ;  $0 < x < \infty$ ;  $u > 0$ ;  $a > 0$.

However, the method outlined by Lieblein is based
on the assumption that the observed data are independent
observations from a statistical distribution of the form of (1.5),

        $F(x) = \exp\left[-e^{-y}\right]$

where $y = a(x - u)$ ;  $a > 0$ ;  $-\infty < x < \infty$.

If the following transformation is made in (3.1),

(3.2)   $X = u\,e^{-Z/a}$  or  $Z = a(-\ln X + \ln u)$

then

        $P(X \le x_1) = P\left(u\,e^{-Z/a} \le x_1\right) = P\left(\ln u - \frac{Z}{a} \le \ln x_1\right)$

(3.3)   $= P\left[-Z \le a(\ln x_1 - \ln u)\right] = P\left[Z \ge a(-\ln x_1 + \ln u)\right]$

        $= P(Z \ge z_1)$.

That is, the probability of a drought $X$ being less than
or equal to $x_1$ is equivalent to the probability of a $Z$
value being greater than or equal to the corresponding $z_1$.

The cumulative distribution function of the new
variate $Z$ is given by

(3.4)   $P(Z \le z) = \exp\left[-e^{-z}\right]$

where

(3.4a)  $z = a(-\ln x + \ln u)$

which is precisely the form of the distribution function (1.5)
considered by Lieblein with $y$ replaced by $z$, $x$ by $-\ln x$,
and $u$ by $-\ln u$.
Therefore, if the negative logarithms of the
droughts are considered instead of the droughts themselves the
method outlined by Lieblein can be applied directly.
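The equivalence stated in (3.3) and (3.4) can be checked numerically. The sketch below uses hypothetical parameter values ($u = 120$, $a = 2.5$) chosen only for illustration, and hypothetical function names; it compares the drought probability with zero lower limit against the upper-tail probability of the transformed variate.

```python
import math

def drought_cdf(x, u, a):
    """P(X <= x) for the drought distribution with lower limit zero."""
    return 1.0 - math.exp(-((x / u) ** a))

def z_value(x, u, a):
    """Transformation (3.4a): z = a(-ln x + ln u)."""
    return a * (-math.log(x) + math.log(u))

def gumbel_cdf(z):
    """P(Z <= z) of (3.4)."""
    return math.exp(-math.exp(-z))

u, a = 120.0, 2.5     # hypothetical parameter values, for illustration only
for x in (30.0, 80.0, 120.0, 200.0):
    z = z_value(x, u, a)
    left = drought_cdf(x, u, a)
    right = 1.0 - gumbel_cdf(z)     # P(Z >= z)
    print(f"x = {x:6.1f}  z = {z:7.3f}  P(X<=x) = {left:.6f}  P(Z>=z) = {right:.6f}")
```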

First, a combination of the two parameters to be
estimated,

(3.5)   $\xi = -\ln u + \frac{z}{a}$

is introduced. Although the distribution is completely
specified by the two parameters $-\ln u$ and $1/a$, it will be
shown that the quantity $\xi$ makes it possible to estimate them
simultaneously and not as two separate parameters.

If the probability $P(Z \le z)$ is chosen to be
some fixed value, then the corresponding $z$ value can be
obtained from relationship (3.4) (tabulated in reference
(3)). Having obtained this $z$ value, the corresponding
value of $\xi$ can be obtained from (3.5). That is, $P$
having been fixed, the values of $z$ and $\xi$ are
automatically fixed. To denote this dependence of $z$ and
$\xi$ on $P$, they will be written

        $z_p$ and $\xi_p$.

If $P$ is chosen at different levels, say
$P = 0.10, 0.05, 0.01$, etc., the corresponding $\xi_p$
will be the estimates used for the predictions for the
negative logarithms of the droughts, such that smaller droughts
will occur only 10, 5, 1, etc. times respectively in 100
future droughts.
It is by the proper choice of $P(Z \le z)$ that
estimates of the parameters $-\ln u$ and $1/a$ are obtained
from the value of $\xi_p$. If $P$ is chosen to be $1/e = 0.36788$,
it is evident from (3.4) that $z_p = 0$. Putting $z_p = 0$ in
equation (3.5) gives

(3.6)   $\xi_p = -\ln u + \frac{z_p}{a} = -\ln u$

which gives the required estimate for $-\ln u$. Similarly,
if the limiting value of $P$ is considered, that is, if $P$ is
allowed to approach one, the corresponding values of $\xi_p$ and $z_p$
become indefinitely large, but their ratio

(3.7)   $\frac{\xi_p}{z_p} = \frac{-\ln u}{z_p} + \frac{1}{a}$

may be considered to be a new parameter which approaches $1/a$.

From the above discussion, it is evident that the
solutions of both the problems of estimation and prediction
are embodied in the one quantity

        $\xi_p = -\ln u + \frac{z_p}{a}$

and estimation of this quantity will be the main problem
dealt with here. The method of attack will be that of
order statistics.
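The reduced variate $z_p$ follows from (3.4) by inversion, $z_p = -\ln(-\ln P)$, so the table look-up can be replaced by a short computation. The sketch below is illustrative: the function names and the parameter values are hypothetical, and the link between a return period $T$ and the level $P = 1 - 1/T$ is an inference from (2.22) and (3.3), not a formula stated explicitly in the text.

```python
import math

def z_p(P):
    """Reduced variate with P(Z <= z_p) = P, obtained by inverting (3.4)."""
    return -math.log(-math.log(P))

def xi_p(P, neg_ln_u, inv_a):
    """Quantity (3.5) evaluated at probability level P."""
    return neg_ln_u + inv_a * z_p(P)

def z_for_return_period(T):
    """Assumed link: a drought with return period T has P(Z <= z) = 1 - 1/T."""
    return z_p(1.0 - 1.0 / T)

neg_ln_u, inv_a = -4.76, 0.42       # hypothetical values, for illustration only
print(z_p(math.exp(-1.0)))          # P = 1/e gives z_p = 0, so xi_p = -ln u
for T in (10, 20, 100):
    z = z_for_return_period(T)
    print(T, round(z, 4), round(neg_ln_u + inv_a * z, 4))
```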


If the values in a sample of $N$ observations are
arranged in, say, increasing order of magnitude, that is,

        $x_1 \le x_2 \le \cdots \le x_N$,

then these $x_i$'s are called order statistics.

Here, the observations are the negative logarithms
of drought values and they must first be ordered in increasing
magnitude, such that

        $-\ln x_1 \le -\ln x_2 \le \cdots \le -\ln x_N$.

The aim is to determine the weights $w_i$, $i = 1, \ldots, N$,
for all the $N$ order statistics so that the linear estimator

(3.8)   $L = \sum_{i=1}^{N} w_i\,(-\ln x_i)$

has the following properties:

(i)  The mathematical expectation of $L$ equals the
     parameter to be estimated. That is,

(3.9)   $E(L) = \xi_p$.

     This condition makes $L$ an unbiased estimator.

(ii) The mean square error (MSE), which in this
     case is the same as the variance, is as small
     as possible, consistent with condition (i).
     That is,

(3.10)  $\mathrm{MSE}(L) = \sigma^{2}(L) = E\left[L - E(L)\right]^{2}$ = a minimum.

For each value $-\ln x_i$, there corresponds a $z_i$ of the
following form (from (3.4a)):

        $z_i = a(-\ln x_i + \ln u)$.

Therefore,

(3.11)  $E(-\ln x_i) = -\ln u + \frac{1}{a} E(z_i)$

and consequently

(3.12)  $E(L) = \sum_{i=1}^{N} w_i\,E(-\ln x_i) = \sum_{i=1}^{N} w_i\left[-\ln u + \frac{1}{a} E(z_i)\right] = \xi_p = -\ln u + \frac{1}{a} z_p$.

This is required to be an identity, and hence if the
coefficients of $-\ln u$ and $1/a$ are equated, the conditions
on the weights $w_i$ are obtained as follows:

        $\sum_{i=1}^{N} w_i = 1$

(3.13)  and

        $\sum_{i=1}^{N} w_i\,E(z_i) = z_p$.
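The values $E(z_i)$ have been tabulated in reference (5).
Turning to the variance, there is obtained

(3.14)  $\mathrm{Var}(L) = \sum_{i=1}^{N} w_i^{2}\,\mathrm{Var}(-\ln x_i) + \sum\sum_{i \ne j} w_i w_j\,\mathrm{Cov}(-\ln x_i, -\ln x_j)$.

From the definition of $-\ln x_i$ in terms of $z_i$ and
utilizing the properties of variances and covariances of
linear estimators and then making a simplification in notation:

(3.15)  $\mathrm{Var}(-\ln x_i) = \frac{1}{a^{2}}\mathrm{Var}(z_i) = \frac{\sigma_i^{2}}{a^{2}}$ ;  $\mathrm{Cov}(-\ln x_i, -\ln x_j) = \frac{1}{a^{2}}\mathrm{Cov}(z_i, z_j) = \frac{\sigma_{ij}}{a^{2}}$

whence,

(3.16)  $V = \mathrm{Var}(L) = \frac{1}{a^{2}}\left[\sum_{i=1}^{N} \sigma_i^{2} w_i^{2} + \sum\sum_{i \ne j} \sigma_{ij} w_i w_j\right]$

        = a minimum subject to (3.9).

This is a constrained minimum problem for variation in the
unknown $w_i$ and is equivalent to finding the unconstrained
minimum of: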
(3.17)  $G = \frac{1}{a^{2}}\left[\sum_{i=1}^{N} \sigma_i^{2} w_i^{2} + \sum\sum_{i \ne j} \sigma_{ij} w_i w_j\right] + \lambda_1\left(\sum_{i=1}^{N} w_i - 1\right) + \mu_1\left(\sum_{i=1}^{N} E(z_i)\,w_i - z_p\right)$

where $\lambda_1$ and $\mu_1$ are Lagrange multipliers. This is the
same as minimizing

(3.18)  $G' = a^{2} G = \sum_{i=1}^{N} \sigma_i^{2} w_i^{2} + \sum\sum_{i \ne j} \sigma_{ij} w_i w_j - \lambda\left(\sum_{i=1}^{N} w_i - 1\right) - \mu\left(\sum_{i=1}^{N} E(z_i)\,w_i - z_p\right)$

with $\lambda = -a^{2}\lambda_1$ and $\mu = -a^{2}\mu_1$,
since $1/a^{2} > 0$ is a constant, though unknown. Setting the
derivative with respect to $w_k$ equal to zero gives

(3.19)  $2\sigma_k^{2} w_k + 2\sum_{i \ne k} \sigma_{ik} w_i = \lambda + \mu E(z_k)$ ,  $k = 1, 2, \ldots, N$.

(3.19) is a system of $N$ linear equations which, if
combined with (3.13), form a simultaneous system of $N + 2$
equations in the $N + 2$ unknowns

        $w_1, w_2, \ldots, w_N, \lambda$ and $\mu$.

Before the sets of equations (3.13) and (3.19) can be
solved, the coefficients $E(z_k)$, $\sigma_k^{2}$ and $\sigma_{ik}$ must be
determined. As previously stated the values of $E(z_k)$ are
tabulated in reference (5). The variances and covariances $\sigma_k^{2}$
and $\sigma_{ik}$ involve complicated integrals which Lieblein has
expressed in terms of simpler ones which are tabulated in
reference (6). These mean, variance, and covariance values are
combined into one table (table III of reference (4)) for
values of $N$ up to and including $N = 6$.

This table gives the coefficients in the equations
(3.13) and (3.19). The right hand sides of these $N + 2$
equations are

        $0, 0, \ldots, 0, 1, z_p$

and the solutions

        $w_i$, $\lambda$ and $\mu$

are linear combinations of these with numerical coefficients
which involve only $\sigma_i^{2}$, $\sigma_{ij}$, and $E(z_i)$ but not $z_p$.
Therefore the solutions are all of the form:

        $w_i = a_i + b_i z_p$ ,  $i = 1, \ldots, N$

(3.20)  $\lambda = c_1 + d_1 z_p$

        $\mu = c_2 + d_2 z_p$
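The structure of the system (3.13), (3.19) and of the solution (3.20) can be illustrated for the smallest case, $N = 2$. The sketch below is not Lieblein's tabulation: the function names are hypothetical, the linear solver is a plain Gauss-Jordan elimination written out for self-containment, and the means, variances and covariance of the ordered pair are closed-form values supplied here as an assumption (with $\gamma$ Euler's constant: $E(z_1) = \gamma - \ln 2$, $E(z_2) = \gamma + \ln 2$, $\sigma_1^{2} = \pi^{2}/6 - 2\ln^{2} 2$, $\sigma_2^{2} = \pi^{2}/6$, $\sigma_{12} = \ln^{2} 2$). For $N = 2$ the two conditions (3.13) already fix the weights, so this case mainly shows the mechanics.

```python
import math

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def weights(means, cov, z_p):
    """Weights w_i solving (3.19) together with the two conditions (3.13)."""
    n = len(means)
    M, rhs = [], []
    for k in range(n):
        # Row k of (3.19): 2 * sum_i cov[k][i] w_i - lambda - mu * E(z_k) = 0
        M.append([2.0 * cov[k][i] for i in range(n)] + [-1.0, -means[k]])
        rhs.append(0.0)
    M.append([1.0] * n + [0.0, 0.0])      # sum of w_i = 1
    rhs.append(1.0)
    M.append(list(means) + [0.0, 0.0])    # sum of w_i E(z_i) = z_p
    rhs.append(z_p)
    return solve(M, rhs)[:n]

# N = 2: assumed closed-form moments of the ordered pair from (3.4)
g, l2 = 0.5772156649015329, math.log(2.0)     # Euler's constant, ln 2
means = [g - l2, g + l2]
cov = [[math.pi ** 2 / 6 - 2 * l2 ** 2, l2 ** 2],
       [l2 ** 2, math.pi ** 2 / 6]]

a = weights(means, cov, 0.0)                                    # a_i: weights at z_p = 0
b = [w1 - w0 for w0, w1 in zip(a, weights(means, cov, 1.0))]    # b_i: change per unit z_p
print([round(v, 4) for v in a], [round(v, 4) for v in b])
```

Because the weights are linear in $z_p$, two solves (at $z_p = 0$ and $z_p = 1$) are enough to recover the coefficients $a_i$ and $b_i$ of (3.20).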
Substituting these values of $w_i$ in equations (3.13) and
(3.19) yields a solution for the minimum variance of the
following form:

(3.21)  $V_{\min} = \frac{1}{a^{2}}\left(A_N z_p^{2} + B_N z_p + C_N\right) = \frac{Q_N}{a^{2}}$.

The quantities $a_i$ and $b_i$ for the weights, and the
coefficients $A_N$, $B_N$, $C_N$ of $Q_N$ are all given in
table 1 of reference (4) for $N = 2$ to $N = 6$. The procedure
for samples larger than 6 is explained in reference (4)
and is outlined in Appendix B of this thesis. Having obtained
the weights $w_i$, the estimates for different probabilities
$P$ can be calculated as illustrated in the example given in
section 3.3.


Lieblein has made an important extension in his work with
extreme values by including methods by which information
concerning the mean and variance of the estimator $\hat{\xi}_p$ can be
obtained. The mean value of an estimator indicates whether on
the average the estimate given is too high or too low relative
to the parameter estimated. The variance makes it possible to
compare the performances of different estimators by indicating
how much the estimators scatter among themselves; that is, it is
a basis for constructing a measure of efficiency of the estimator.


In order to have a standard of comparison, all variances
are scaled by dividing them into a theoretically specified
variance which is known as the "Cramér-Rao Lower Bound"
(reference (7), pg. 480). This variance is less than or
equal to the variance of any unbiased estimator of the parameter
in question.

The resulting efficiency is an absolute number between
0 and 1, and is given by

(3.22)  Efficiency $(L) = E_N(L) = \frac{\text{Cramér-Rao lower bound}}{Q_N / a^{2}}$

where $L$ is the estimator and $Q_N / a^{2}$ is its minimum variance
(3.21). The quantities $E_N(L)$, which depend on $Q_N$ (since $Q_N$
depends on $z_p$) and consequently on $P$, are tabulated for
$N = 2$ to $N = 6$, for different probability levels $P$, in
table III of reference (4).

Lieblein uses the standard deviations of the estimator
$\hat{\xi}_p$ to establish confidence limits around the predicted values.
For a fixed probability $P$, the interval

(3.23)  $\hat{\xi}_p \pm \sigma(\hat{\xi}_p)$

will contain the true unknown parameter $\xi_p$
about 68% of the time. If two standard deviations are used,
the percentage rises to 95%.


3.3  An example of drought analysis using order statistics.

The drought observations given in table 3.1 were obtained
at Point P on River R over a 17 year period. Since there are
17 observations, they must be split into subgroups according to
the rules given in Appendix B. Three subgroups are obtained,
two consisting of 6 observations each and a third consisting of
5 observations. The negative logarithms of the droughts are
obtained and these are ordered within each subgroup in
increasing magnitude. The remaining calculations are presented
in the form of two self-explanatory work sheets suggested by
Lieblein.

Table 3.1: Drought observations on River R and their
negative logarithms.

 Yr.   Droughts x    -ln x        Yr.   Droughts x    -ln x

  1        76       -4.3307       10       169       -5.1299
  2        57       -4.0431       11       113       -4.7274
  3        51       -3.9318       12        68       -4.2195
  4        50       -3.9120       13       115       -4.7449
  5       182       -5.2040       14       122       -4.8040
  6       189       -5.2418       15        52       -3.9512
  7       123       -4.8122       16        50       -3.9120
  8       108       -4.6821       17        80       -4.3820
  9       142       -4.9958
Work Sheet 1:

1. Subgroup sizes and proportionality factors.

   $N = 17 = km + m' = 2 \times 6 + 5$

   $t = \frac{km}{N} = \frac{12}{17} = 0.7059$ ;  $t' = \frac{m'}{N} = \frac{5}{17} = 0.2941$

   $\frac{t^{2}}{k} = 0.2492$ ;  $(t')^{2} = 0.0865$

   $k = 2$ ;  $m = 6$ ;  $m' = 5$.

2. (a) Main subgroups

   Weights $a_i$ and $b_i$ (from table 1, reference (4))

   i          1         2         3         4         5         6
   a_i     0.3555    0.2255    0.1656    0.1211    0.0835    0.0489
   b_i    -0.4593   -0.0360    0.0732    0.1267    0.1495    0.1458

   $-\ln x_i$ in increasing order

          -ln x_1   -ln x_2   -ln x_3   -ln x_4   -ln x_5   -ln x_6
   1:     -5.2418   -5.2040   -4.3307   -4.0431   -3.9318   -3.9120
   2:     -5.1299   -4.9958   -4.8122   -4.7274   -4.6821   -4.2195

   $T = \frac{1}{k}\sum\sum_{i=1}^{6} a_i x_i + \frac{z_p}{k}\sum\sum_{i=1}^{6} b_i x_i = -4.8401 + 0.4386\,z_p$

   (the outer sums run over the $k$ main subgroups, and $x_i$ denotes
   the ordered entry $-\ln x_i$)

   (b) Remainder group

   Weights $a'_i$ and $b'_i$ (from table 1, reference (4))

   i          1         2         3         4         5
   a'_i    0.4189    0.2463    0.1676    0.1088    0.0584
   b'_i   -0.5031    0.0065    0.1305    0.1817    0.1845

   $-\ln x'_i$ in increasing order

          -ln x'_1  -ln x'_2  -ln x'_3  -ln x'_4  -ln x'_5
          -4.8040   -4.7449   -4.3820   -3.9512   -3.9120

   $T' = \sum_{i=1}^{5} a'_i x'_i + z_p \sum_{i=1}^{5} b'_i x'_i = -4.5738 + 0.3745\,z_p$

Therefore

   $\hat{\xi}_p = tT + t'T'$

   $= (0.7059)(-4.8401 + 0.4386\,z_p) + (0.2941)(-4.5738 + 0.3745\,z_p)$

   $= -4.7618 + 0.4197\,z_p$

and the estimates for $-\ln u$ and $1/a$ are:

   $-\ln u = -4.7618$  and  $\frac{1}{a} = 0.4197$
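The arithmetic of Work Sheet 1 can be retraced with a few lines of code. The sketch below is illustrative only: the variable names are hypothetical, and both the weights and the ordered negative logarithms are entered exactly as printed in the work sheet rather than recomputed from the raw droughts.

```python
import math

# Weights as printed in Work Sheet 1 (table 1 of reference (4))
A6 = [0.3555, 0.2255, 0.1656, 0.1211, 0.0835, 0.0489]
B6 = [-0.4593, -0.0360, 0.0732, 0.1267, 0.1495, 0.1458]
A5 = [0.4189, 0.2463, 0.1676, 0.1088, 0.0584]
B5 = [-0.5031, 0.0065, 0.1305, 0.1817, 0.1845]

# Negative logarithms as printed in Work Sheet 1, in increasing order
main = [[-5.2418, -5.2040, -4.3307, -4.0431, -3.9318, -3.9120],
        [-5.1299, -4.9958, -4.8122, -4.7274, -4.6821, -4.2195]]
rest = [-4.8040, -4.7449, -4.3820, -3.9512, -3.9120]

def dot(w, v):
    """Weighted sum of one ordered subgroup."""
    return sum(wi * vi for wi, vi in zip(w, v))

k, m, m_rem = len(main), len(A6), len(A5)
N = k * m + m_rem
t, t_rem = k * m / N, m_rem / N       # proportionality factors t and t'

# The estimator has the form intercept + slope * z_p
T_int = sum(dot(A6, grp) for grp in main) / k
T_slope = sum(dot(B6, grp) for grp in main) / k
intercept = t * T_int + t_rem * dot(A5, rest)     # estimate of -ln u
slope = t * T_slope + t_rem * dot(B5, rest)       # estimate of 1/a

u_hat = math.exp(-intercept)
print(f"-ln u = {intercept:.4f}, 1/a = {slope:.4f}, u = {u_hat:.1f}")
```

A predicted value of $-\ln x$ at any level $P$ then follows by adding `slope` times the corresponding $z_p$ to `intercept`.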
3.4  The general case where the lower limit is not zero.

Let $X$ be a random variable representing drought values.
The probability that a drought more severe (that is, numerically
smaller) than $x$ will occur is given by

(3.24)  $P(X \le x) = P(x) = 1 - \exp\left[-\left(\frac{x - \varepsilon}{u - \varepsilon}\right)^{a}\right]$ ;

        $\varepsilon \le x < \infty$ ;  $u > \varepsilon$ ;  $a > 0$ ;  $\varepsilon \ge 0$,

where $u$, $a$ and $\varepsilon$ are parameters which must be estimated.

If this case is to be treated using the method of order
statistics outlined in section 3.2, (3.24) must be put in the
form

        $F(x) = \exp\left[-e^{-a(x - u)}\right] = \exp\left[-e^{-y}\right]$.

The transformation linking these two cumulative distribution
functions is

(3.25)  $X - \varepsilon = (u - \varepsilon)\,e^{-Z/a}$  or  $Z = a\left[-\ln (X - \varepsilon) + \ln (u - \varepsilon)\right]$

in (3.24).
An effort was made using order statistics to obtain a
method that would yield unbiased estimates simultaneously for
the three parameters $u$, $1/a$, $\varepsilon$, but this was unsuccessful.
Instead, the following "combined" method is proposed to handle
the case where the lower limit is not zero.

First the lower limit $\varepsilon$ is estimated by the method of
moments outlined in section 2.4. If this estimate of $\varepsilon$ is
then subtracted from each of the original observations on $X$,
the cumulative distribution function (3.24) reduces to one of
the form (3.1) in the new variable, say

        $X_{(1)} = X - \varepsilon$,

and the two parameters $u_{(1)} = u - \varepsilon$ and $1/a$.

Table 3.3 gives the estimates of $u$ and $1/a$
obtained by applying this combined method to drought
observations on River R over a period of 17 years. Since more
than one set of data was desired, the years were split up into
quarters and the droughts during each quarter were analysed.
For the purpose of comparison the estimates of $u$ and $1/a$
obtained by the method of moments on the same data are also
given in table 3.3.
Table 3.3

 Quarter     Method of Moments        Combined Method
               u         1/a            u         1/a

    1        425.7     0.5904         410.3     0.4926
    2        220.1     0.5357         226.0     0.5611
    3        116.2     0.4684         117.1     0.4665
    4        142.8     0.9048         142.3     0.7516

One of the rather serious disadvantages in applying the
method of moments to the case where $\varepsilon$ is not equal to zero is
that confidence intervals for the predicted droughts are
extremely difficult to obtain; in fact there is no method
available at this time by which they can be obtained. This
disadvantage is partially overcome if the combined method is
used, as approximate confidence intervals can be obtained for
the predicted values of $-\ln(x - \varepsilon)$ and these can be converted
into approximate confidence limits for the actual predicted values.

Table 3.4 gives the predicted droughts with return periods
10, 20, and 100 years (denoted by $x_{10}$, $x_{20}$ and $x_{100}$
respectively) both for the method of moments and the combined
method. In addition the confidence band half-widths for the
predictions of $-\ln(x_n - \varepsilon)$ ($n = 10$, 20, or 100) obtained by
the combined method are given along with the confidence limits for
the predicted droughts. It must be kept in mind that these
confidence limits are only approximate, since there is no control
on the estimate used for $\varepsilon$.

Table 3.4

                           Predicted values x_n      Approx. 68%        Approx. 68%
   n    -ln(x_n - ε)      Method of    Combined      conf. band         conf. limits
                          Moments      Method        half-widths        for the
                                                     for -ln(x_n - ε)   predicted x_n

 1st Quarter
   10     -4.8806          117.6        142.7          0.2938           109.3 - 187.7
   20     -4.5260           81.6        103.6          0.3649            75.3 - 144.3
  100     -3.7230           38.5         52.5          0.5323            35.4 -  81.6

 2nd Quarter
   10     -3.9268           96.8         96.6          0.3336            82.6 - 117.0
   20     -3.5229           81.0         80.0          0.4157            68.5 -  97.4
  100     -2.6083           60.9         59.7          0.6063            53.5 -  71.0

 3rd Quarter
   10     -3.6331           45.4         46.7          0.2782            37.6 -  58.9
   20     -3.2973           35.2         36.0          0.3456            28.0 -  47.1
  100     -2.5369           21.3         21.5          0.5041            16.5 -  29.8

 4th Quarter
   10     -3.0492           42.2         49.0          0.4483            41.4 -  61.0
   20     -2.5032           35.5         40.2          0.5568            34.9 -  49.4
  100     -1.2831           29.7         31.5          0.8121            29.5 -  36.0
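The conversion from a confidence band for $-\ln(x_n - \varepsilon)$ to limits for the drought itself can be sketched as follows. The function names are hypothetical; the example uses the second-quarter, 20-year entries of Table 3.4 together with the lower limit 46.06 estimated in Table 2.5, on the assumption that this is the estimate that was subtracted for that quarter.

```python
import math

def drought_from_xi(xi, eps):
    """Invert xi = -ln(x - eps) to recover the drought x."""
    return eps + math.exp(-xi)

def confidence_limits(xi_hat, half_width, eps):
    """Map the band xi_hat +/- half_width onto limits for the drought x."""
    lower = drought_from_xi(xi_hat + half_width, eps)   # larger xi means smaller drought
    upper = drought_from_xi(xi_hat - half_width, eps)
    return lower, upper

# Second quarter, n = 20: values from Table 3.4, lower limit from Table 2.5
xi_hat, half_width, eps = -3.5229, 0.4157, 46.06
x_hat = drought_from_xi(xi_hat, eps)
low, high = confidence_limits(xi_hat, half_width, eps)
print(f"predicted x = {x_hat:.1f}, approx. 68% limits = {low:.1f} to {high:.1f}")
```

Because the exponential is nonlinear, the limits are not symmetric about the predicted drought, which is visible throughout the last column of Table 3.4.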
CHAPTER IV

Conclusion

Four methods have been presented to deal with the problem
of analysing minimum values. The first was a graphical method
which utilized a special probability paper; the second was
based on the classical method of moments; the third used order
statistics to deal with the special case where the lower limit
was assumed to be zero; and the fourth combined the methods
of moments and order statistics to handle the general case where
the lower limit is assumed to be some positive number. In this
chapter a brief discussion of these four methods will be given.

The graphical method presented in section 2.2,
although very simple and compact, has one rather serious
disadvantage. The plotted points do not tend to cluster around
one straight line as Gumbel claims they will. Rather they seem
to form two lines, as is illustrated in figure 4.1, which is a
graph of the same observations used in figure 2.1. The uppermost
line in figure 4.1 is interpreted as being formed by
moderate droughts which do not belong to the extreme values
proper, but still to the initial distribution. The second
line is formed by more severe droughts and only it can be used
for extrapolation purposes. This discontinuity leads to a loss
of about 37% of the information furnished by the observations.¹

¹ This percentage is quoted by Gumbel in reference 2. It appears
that all the observations larger than the "characteristic drought"
$u$ (see section 2.4) have to be discarded.
The method of moments, the second method proposed by
Gumbel, corrects this loss of information by recognizing the
fact that one must assume droughts to be extreme values from a
limited (to the left) distribution. This, of course, gives rise
to a third parameter, the lower limit, which has to be
estimated.

As can be seen from equation (2.34), the estimate given
by the method of moments for the lower limit is dependent on
the standard deviation of the sample, so that the smaller the
variation within the sample the larger the estimate for the
lower limit. Due to this fact it is quite possible for a river
with very low (more severe) observed droughts to yield a
higher estimate for the lower limit than one with higher (less
severe) observed droughts. The estimated lower limit may even
turn out to be larger than the smallest observed value. If the
latter value is reliable, the method fails. Another possibility
which would cause the method to fail would be for the lower limit
to take on a large negative value.
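The dependence of the estimated lower limit on the sample standard deviation can be made concrete by combining (2.33) and (2.34). The sketch below is illustrative: the function name is hypothetical, $A(a)$ and $B(a)$ are held fixed at the values of Table 2.5 (that is, the skewness is assumed unchanged), and the standard deviations 40 and 120 are invented for comparison with the observed 85.95.

```python
def lower_limit(mean, s, A, B):
    """Moment estimate of the lower limit from (2.33) and (2.34)."""
    u = mean + s * A      # (2.33), with s in place of sigma
    return u - s * B      # (2.34)

A_TAB, B_TAB = 0.2269, 2.0248     # A(a), B(a) from Table 2.5, held fixed here
mean = 200.59                     # mean drought from Table 2.5
for s in (40.0, 85.95, 120.0):    # 40 and 120 are hypothetical standard deviations
    eps = lower_limit(mean, s, A_TAB, B_TAB)
    print(f"s = {s:6.2f}  estimated lower limit = {eps:7.2f}")
```

With the mean held fixed, a smaller spread pushes the estimate upward and a larger spread drives it negative, which are the two failure modes described above.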

The  third  method,  that  of  order  statistics,  is  outlined 
in  chapter  III  for  application  when  the  lower  limit  is 
assumed  to  be  zero.  A  comparison  between  this  method  and  the 
method  of  moments  is  given  in  reference  (Ir) .  The  comparison 
is  carried  out  after  first  combining  the  estimates  for  u  and 
a  given  by  the  method  of  moments  into  one  estimator  (which  will 


be  referred  to  as  the  "moment’s  estimator")  that  has  similar 
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form  to  that  of  the  order  statistics  estimator  given  by 

relationship  (3o?)*  The  main  interest  is  to  compare  the 
efficiency  of  the  two  estimators*  In  order  to  obtain  these 
efficiencies,  the  first  two  moments  of  the  sample  m.ean  and 
standard  deviation  and  the  covariance  of  the  mean  and  standard 
deviation  must  be  obtained*  For  the  moment’s  estimator,  only 
the  first  two  moments  of  the  sample  mean  are  readily  obtainable 
by  standard  procedures.  Therefore,  the  comparison  is  carried 
out  using  a  simplified  form  of  the  moment's  estimator  which 
is  valid  only  for  large  samples*  However  it  is  shown  that 
the  original  moments  est5jnator  is  much  less  efficient  than 
the  one  considered*  From  this  comparison  the  following 
advantages  of  the  order  statistics  estimator  seem  apparent. 

(a) The method of order statistics provides an estimator
known to be unbiased, whose efficiency can be simply and
accurately evaluated.

(b) The estimator is more efficient than a simplified form
of the moments estimator, for samples of about 20 or more and
probability P = 0.95 and more.

(c)  The  order  statistics  method  uses  a  more  exact  procedure 
to  obtain  the  reliability  of  predicted  values,  and  this 
procedure  yields  smaller  confidence  intervals  in  many  cases. 

The following two limitations on the method of order statistics
should be noted:

(a) The method is applicable only when the assumptions on
which it is based are considered to be approximately satisfied,
namely, that the observations constitute an independent sample
from the population

$F(x) = \exp\left[-e^{-a(x-u)}\right]$

(b) The method of order statistics treats each observation
on an individual basis, and hence is not very suitable for
large samples, since the observations cannot be grouped.


The combined method outlined earlier is a rather
obvious combination of the methods of moments and order
statistics. However, it has the advantage that, for the first
time, confidence limits are obtainable for the predicted
droughts, even though they are approximate. The predicted
values compare very well with those obtained by the method of
moments, with the exception of those for the first quarter
(see table 3.3). As stated previously in this chapter, the
method of moments estimate for the lower limit depends on the
variation within the observed sample, and consequently the
predicted values tend to be either too high (for a small
sample variation) or too low (for a large sample variation).
Since the variation within the first sample is rather large
(see table 2.1), it would seem logical, based on the above
discussion, to expect the predicted droughts for this quarter
by the method of moments to be too low. As can be seen from
table 3.3, all the predicted values for the first quarter
calculated by the combined method are higher than those
obtained by using the method of moments. Since the predicted
values for the other three quarters, where the variations are
not abnormally high, are quite comparable, it would seem that
this combined method tends to give more accurate estimates for
samples with large variations, although the degree of accuracy
is not known and further investigation is needed on this point.

A natural extension of this investigation into the analysis
of minimum values would be to obtain a method that would give
unbiased estimates for u, 1/a, and ε simultaneously in such
a way that the efficiency of the estimators could be obtained
without too much difficulty. An attempt was made to accomplish
this by applying the method of maximum likelihood and also by
trying to extend the method of order statistics, but both were
unsuccessful. However, there is certainly scope for further
investigation into both these methods.
APPENDIX A
Probability Paper

Let X be a continuous random variable, unlimited in
both directions and having cumulative distribution function

P(X < x) = F(x)

Assume the existence of a linear transformation

(A.1)  $x = \mu + \beta y$

where $\mu$ and $\beta$ are location and scale parameters respectively,
both having the dimension of x. The new variable y, known
as the reduced variable, has dimension zero. A well known
example of such a transformation is used in standardizing a
normal variate by putting

(A.2)  $z = \frac{x - \mu}{\sigma}$

where $\mu$ and $\sigma$ are the population mean and standard deviation
respectively.

If the values of the variable X are plotted against
the reduced variable y, a straight line would naturally
result since the relationship between them is linear. However,
the problem arises as to how the y value corresponding to
a particular x is arrived at, since $\mu$ and $\beta$ are parameters
that are unknown.
Corresponding to the cumulative distribution function
F(x) of X, there is a cumulative distribution function of
y, say $\Phi(y)$, and

(A.3)  $\Phi(y) = F(x)$.

The important point here is that $\Phi(y)$ is independent of
the parameters $\mu$ and $\beta$. Therefore, if an estimate of
$\Phi(y)$ could be obtained, the corresponding y value would
automatically be known. To estimate $\Phi(y)$, an estimate of
F(x) is obtained from the observations on X as follows:

Let $x_1 \le x_2 \le \dots \le x_N$
be N observations on the variate X, assumed to have the
cumulative distribution function F(x), ordered in increasing
magnitude. Then the mth value $x_m$ has the density function

(A.4)  $g(x_m) = \frac{N!}{(N-m)!\,(m-1)!} \left[\int_{-\infty}^{x_m} f(x)\,dx\right]^{m-1} \left[\int_{x_m}^{\infty} f(x)\,dx\right]^{N-m} f(x_m)$

Let $F_m$ be the proportion of the population f(x) preceding
$x_m$, that is,

(A.5)  $F_m = \int_{-\infty}^{x_m} f(x)\,dx$

Clearly $1 \ge F_m \ge 0$. Then the density function of $F_m$ is

(A.6)  $h(F_m) = \frac{N!}{(N-m)!\,(m-1)!}\, F_m^{\,m-1} (1 - F_m)^{N-m}$

The expected value of $F_m$ is

(A.7)  $E(F_m) = \frac{N!}{(N-m)!\,(m-1)!} \int_0^1 F_m^{\,m} (1 - F_m)^{N-m}\,dF_m = \frac{m}{N+1}$

That is, the average proportion of the population f(x)
preceding the mth value $x_m$ is $\frac{m}{N+1}$, and this average
proportion is taken as the estimate used for $\Phi(y)$. Hence,
corresponding to each observation $x_m$, there is an estimate
of the cumulative distribution function $F(x) = \Phi(y)$, and
therefore an estimate of y.
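
The expectation m/(N + 1) can be checked exactly for small samples. The following is a minimal sketch, not part of the original derivation: it evaluates the integral of t^m (1 - t)^(N-m) over [0, 1] in closed form with exact rational arithmetic. The function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def beta_integral(a: int, b: int) -> Fraction:
    """Exact integral of t**a * (1 - t)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def expected_proportion(m: int, n: int) -> Fraction:
    """Expected proportion of the population preceding the m-th smallest of n values."""
    constant = Fraction(factorial(n), factorial(n - m) * factorial(m - 1))
    return constant * beta_integral(m, n - m)


# Compare with the closed form m / (N + 1) for every rank in a few sample sizes.
for n in (5, 6, 20):
    for m in range(1, n + 1):
        assert expected_proportion(m, n) == Fraction(m, n + 1)
```

Because the arithmetic is exact, agreement here is an identity for the sample sizes tried rather than a numerical approximation.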


Probability paper is a rectangular grid on which the
variate X is plotted on one of the axes (usually the vertical)
on a linear scale. The other axis is scaled in such a way that
if the estimates for the cumulative distribution function F(x)
are plotted against the x's, a straight line will result.
This enables one to obtain the theoretical straight line

$x = \mu + \beta y$

and to estimate the parameters $\mu$ and $\beta$ (by ordinary regression
procedures) without ever actually obtaining the y values.
If the observations on X are ordered in decreasing
magnitude as in section 2.2, that is

$x_1 \ge x_2 \ge \dots \ge x_N$

the same probability paper can be used if the proportion
estimated is 1 - F(x) instead of F(x). As an estimate
for this quantity, the average proportion of the population
f(x) exceeding the mth value (from above) is used. This
average is found to be $\frac{m}{N+1}$, which is used in section 2.2.

As an example of the use of probability paper, consider
a variate X distributed normally with mean $\mu$ and standard
deviation $\sigma$. The cumulative distribution function of this
variate is given by

(A.8)  $P(X < x_i) = F(x_i) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x_i} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx$

The linear transformation here is

(A.9)  $z_i = \frac{x_i - \mu}{\sigma}$

and (A.8) becomes

(A.10)  $\Phi(z_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_i} \exp\left(-\frac{z^2}{2}\right) dz$

which is free of parameters and has been tabulated.
If one considers a sample of N observations from this
distribution, ordered in decreasing magnitude

$x_1 \ge x_2 \ge \dots \ge x_N$

then the value $\frac{m}{N+1}$, which is the expectation of the
proportion of the population exceeding $x_m$, can be used as
an estimate for 1 - F(x). If the points $\left(\frac{m}{N+1},\, x_m\right)$
are plotted on normal probability paper they should cluster
around the straight line

$x = \hat{\mu} + \hat{\beta} z$

where $\hat{\mu}$ and $\hat{\beta}$ are estimates of the population mean and
standard deviation, and are obtained by ordinary regression
procedures.
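
The normal probability paper procedure can be imitated numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it orders the sample in increasing magnitude, so that m/(N + 1) estimates F(x) rather than 1 - F(x), converts each plotting position to a reduced value z with the inverse standard normal distribution function, and fits the line by least squares of x on z. The function name and the choice of regressing x on z are assumptions.

```python
from statistics import NormalDist


def fit_normal_probability_paper(observations):
    """Return (location, scale) of the least-squares line x = location + scale * z."""
    xs = sorted(observations)            # increasing order: x_1 <= ... <= x_N
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    standard_normal = NormalDist()       # mean 0, standard deviation 1
    # Plotting position m / (N + 1) estimates F(x_m); invert it to get z_m.
    zs = [standard_normal.inv_cdf(m / (n + 1)) for m in range(1, n + 1)]
    z_bar = sum(zs) / n
    x_bar = sum(xs) / n
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs))
    scale = s_zx / s_zz
    location = x_bar - scale * z_bar
    return location, scale
```

On graph paper the inversion step is done by the non-linear probability scale itself, which is why the y (or z) values never have to be computed by hand.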
APPENDIX B

Extension to larger samples.

Most samples are larger than the trivial size of
six. The following will outline how these larger samples are
to be handled. The principle is to treat them as sets of
subgroups of six (or five). Two cases arise:

Case I: Sample size an exact multiple of 5 or 6.

Let the sample size n = km, where m is the size
of the subgroup and k is the number of subgroups in the
sample. Now each subgroup is treated as a separate sample of
size m. This is legitimate if the original sample is
divided into subgroups in such a way that each subgroup
consists of statistically independent observations.

From each subgroup a "subestimator" is formed:

(B.1)  $T_i = \sum_{j=1}^{m} w_j x_{ij}$,  i = 1, 2, ..., k

where $x_{ij}$ is the jth ordered observation of the ith subgroup
and the weights $w_j$ are obtained as in chapter III and
are the same for each subgroup of size m. The arithmetic
mean of these k subgroup estimators $T_i$ is then taken to
be the grand sample estimator:

(B.2)  $T = \frac{1}{k} \sum_{i=1}^{k} T_i$
The variance of T is given by

(B.3)  $\mathrm{Var}(T) = \frac{\mathrm{Var}(T_i)}{k}$

since this variance is that of a mean of k independent
quantities, each of which has the same variance (given
in table III, reference 4).

The efficiency of T is, since n = km and the
$T_i$'s, and therefore T, are unbiased:

(B.4)  $\text{Eff.} = \frac{\text{Cramér-Rao lower bound}}{\mathrm{Var}(T)}$

where the numerator is the Cramér-Rao lower bound, which can be
obtained from table III, reference 4. Since the efficiency
depends only on the size m of the subgroup and increases
with increased m, the largest size of subgroup should be
chosen if there is a choice.
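
Case I can be sketched in a few lines. The weights of chapter III are not reproduced in this appendix, so the weights passed in below are purely hypothetical placeholders. Sorting each subgroup in increasing order and forming subgroups from consecutive observations are likewise assumptions of this sketch rather than statements of the original procedure.

```python
def grand_estimator(sample, weights):
    """Mean of the k subgroup estimators T_i = sum_j w_j * x_ij (Case I, n = k * m)."""
    m = len(weights)
    if m == 0 or len(sample) % m != 0:
        raise ValueError("sample size must be an exact multiple of the subgroup size")
    k = len(sample) // m
    subestimates = []
    for i in range(k):
        # Consecutive observations form a subgroup; order them before weighting.
        subgroup = sorted(sample[i * m:(i + 1) * m])
        subestimates.append(sum(w * x for w, x in zip(weights, subgroup)))
    return sum(subestimates) / k


def grand_variance(subgroup_variance, k):
    """Variance of the mean of k independent subestimators with a common variance."""
    return subgroup_variance / k


# Hypothetical weights for subgroups of size 2, for illustration only.
example = grand_estimator([4.0, 2.0, 1.0, 3.0], [0.25, 0.75])
```

With the hypothetical weights shown, the two subgroups give subestimates 3.5 and 2.5, and their mean is the grand estimate.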

Case II: Sample size not an exact multiple of 5 or 6.

The aim, of course, is to establish rules that are as
simple as possible without too great a loss in efficiency.
Actually two separate cases arise:

(a) For n = 7 up to large values:

(i) Use the partition n = 6k + m' if the
remainder m' = 2, 3, 4 or 5. If m' = 1,
use n = 5k + m''.

(ii) If n is a multiple of 30 plus 1, that
is n = 31, 61, 91, etc., write
n = 30k + 1 = (30k - 5) + 6 = 5(6k - 1) + 6
that is, split the sample into 6k - 1 subgroups
of 5 and a remainder subgroup of 6.
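
The partition rules of Cases I and II(a) can be collected into one small routine. This is a sketch under assumptions: the function name and return convention are illustrative, and when n is a multiple of both 5 and 6 the subgroup size 6 is preferred, following the remark in Case I that the largest subgroup should be chosen.

```python
def partition(n):
    """Return (m, k, remainder): k subgroups of size m plus one remainder subgroup."""
    if n < 5:
        raise ValueError("the rules apply to samples of at least five observations")
    if n % 6 == 0:                      # Case I, larger subgroup preferred
        return 6, n // 6, 0
    if n % 5 == 0:                      # Case I with subgroups of five
        return 5, n // 5, 0
    if n % 6 in (2, 3, 4, 5):           # Case II (a)(i): n = 6k + m'
        return 6, n // 6, n % 6
    # Here n = 6k + 1, so switch to subgroups of five: n = 5k + m''.
    if n % 5 == 1:                      # Case II (a)(ii): n = 30j + 1 = 5(6j - 1) + 6
        return 5, n // 5 - 1, 6
    return 5, n // 5, n % 5


# Every partition must account for all n observations.
for size in range(5, 200):
    m, k, rest = partition(size)
    assert m * k + rest == size and rest != 1
```

The loop confirms that, under these rules, no sample size from 5 to 199 leaves a remainder subgroup of a single observation.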

In order to obtain the estimator $\hat{\xi}_P$ and its
variance, assume the sample has been split into two parts,
one consisting of k equal subgroups of size m, and the
other consisting of the remainder subgroup of size m'. The
average T of the first k subgroups is found as outlined
in case I. Then a subestimator T' is found from the
remainder subgroup by using the weights $w_i'$ for a sample
of size m', that is

(B.5)  $T' = \sum_{i=1}^{m'} w_i' x_i'$.

Finally, a weighted average of T and T' is
found and this is taken as the final estimator $\hat{\xi}_P$:

(B.6)  $\hat{\xi}_P = t\,T + (1 - t)\,T'$

where

(B.6a)  $t = \frac{km}{n}$
Since all the subgroups are independent, and hence
T and T' are independent, and since the variance of the
mean T is $\mathrm{Var}(T_i)/k$, therefore,

(B.7)  $\mathrm{Var}(\hat{\xi}_P) = t^2\,\frac{\mathrm{Var}(T_i)}{k} + (1 - t)^2\,\mathrm{Var}(T')$

where t is the weight given to T in the weighted average.
The efficiency can be obtained in the same way as outlined
in case I.
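
The weighted average and its variance can be sketched as follows. The scan does not show the weight legibly, so this example assumes that T is weighted by the fraction km/n of the sample it is based on. That weighting, the function name and the argument names are all assumptions. The variance line uses only the independence of T and T'.

```python
def combine_estimators(t_main, var_main, t_rem, var_rem, k, m, m_rem):
    """Weighted average of T (k subgroups of size m) and T' (remainder of size m_rem).

    var_main is the variance of the mean T, var_rem the variance of T'.
    The weight km / n is an assumed choice, proportional to the observations used.
    """
    n = k * m + m_rem
    weight = k * m / n
    estimate = weight * t_main + (1.0 - weight) * t_rem
    # Independent terms: variances add after scaling by the squared weights.
    variance = weight ** 2 * var_main + (1.0 - weight) ** 2 * var_rem
    return estimate, variance


# Illustrative numbers only: one subgroup of six and a remainder of two.
estimate, variance = combine_estimators(10.0, 0.4, 12.0, 1.0, k=1, m=6, m_rem=2)
```

With these illustrative inputs the weight on T is 0.75, so the remainder subgroup contributes only one sixteenth of its own variance to the total.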


(b) n extremely large:

If the number of subgroups is of the order 50
to 1000, the amount of computation becomes very laborious.
The following short cut method is suggested to deal with
these cases. Although there is quite a large loss in
efficiency, the method is of practical value inasmuch as
a loss in efficiency is effectively a loss in sample size,
which is not too important if an extensive amount of data is
available.


First arrange all n observations in order of
increasing size, and then rank them from one to n. Select
the three observations $x_r$ whose ranks r are the nearest
integers to 0.03n, 0.20n and 0.85n. These will be
denoted by:

$x_{0.03n}$, $x_{0.20n}$, and $x_{0.85n}$.
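
The selection step of the short cut method can be sketched as below. Only the choice of the three observations is shown, since the weights applied to them are not reproduced here. The function names are illustrative, and rounding exact halves upward and never returning a rank below one are assumptions of this sketch.

```python
def shortcut_ranks(n):
    """Ranks nearest to 0.03n, 0.20n and 0.85n, using integer arithmetic."""
    if n < 1:
        raise ValueError("the sample must contain at least one observation")
    # (p * n + 50) // 100 rounds p percent of n to the nearest integer.
    return tuple(max(1, (percent * n + 50) // 100) for percent in (3, 20, 85))


def shortcut_observations(sample):
    """The three observations with those ranks after ordering in increasing size."""
    ordered = sorted(sample)
    return tuple(ordered[rank - 1] for rank in shortcut_ranks(len(ordered)))
```

Integer arithmetic is used so that the chosen ranks do not depend on floating-point rounding of products such as 0.85n.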

The predicted values $\hat{\xi}_P$, for various
probability levels P, can be computed from




<»•=>  Tp  -  •  0.3256  .  0.6759) 

(See ref. 4)

The  variance  of  this  estimator  can  be  computed  from 


(B.9)  =*  806916  d2  -  O0O68I  d  +  lo?iiU2 

d  »  0.3256  Z  +  0.15U9  . 